Spoken language identification on 4 Indonesian local languages using deep learning
Abstract
Language identification is at the forefront of assistance in many applications, including multilingual speech systems, spoken language translation, speech recognition, and human-machine interaction via voice. Identifying Indonesian local languages with this technology has enormous potential to advance tourism and digital content in Indonesia. The goal of this study is to identify four Indonesian local languages (Javanese, Sundanese, Minangkabau, and Buginese) using deep learning classification techniques: artificial neural network (ANN), convolutional neural network (CNN), and long short-term memory (LSTM). Mel-frequency cepstral coefficients (MFCC) were selected as the feature extraction method for the audio data. The results showed that the LSTM model had the highest accuracy for each speech duration (3 s, 10 s, and 30 s), followed by the CNN and ANN models.
Similar resources
Deep learning for spoken language identification
Empirical results have shown that many spoken language identification systems based on hand-coded features perform poorly on small speech samples where a human would be successful. A hypothesis for this low performance is that the set of extracted features is insufficient. A deep architecture that learns features automatically is implemented and evaluated on several datasets.
Deep Bottleneck Features for Spoken Language Identification
A key problem in spoken language identification (LID) is to design effective representations which are specific to language information. For example, in recent years, representations based on both phonotactic and acoustic features have proven their effectiveness for LID. Although advances in machine learning have led to significant improvements, LID performance is still lacking, especially for ...
Task-aware deep bottleneck features for spoken language identification
Recently, deep bottleneck features (DBF) extracted from a deep neural network (DNN) containing a narrow bottleneck layer, have been applied for language identification (LID), and yield significant performance improvement over state-of-the-art methods on NIST LRE 2009. However, the DNN is trained using a large corpus of specific language which is not directly related to the LID task. More recent...
Spoken Emotion Recognition Using Deep Learning
Spoken emotion recognition is a multidisciplinary research area that has received increasing attention over the last few years. In this paper, restricted Boltzmann machines and deep belief networks are used to classify emotions in speech. The motivation lies in the recent success reported using these alternative techniques in speech processing and speech recognition. This classifier is compared...
Spoken language identification using the speechdat corpus
Current language identification systems vary significantly in their complexity. The systems that use higher level linguistic information have the best performance. Nevertheless, that information is hard to collect for each new language. The system presented in this paper is easily extendable to new languages because it uses very little linguistic information. In fact, the presented system needs...
Journal
Journal title: Bulletin of Electrical Engineering and Informatics
Year: 2022
ISSN: 2302-9285
DOI: https://doi.org/10.11591/eei.v11i6.4166